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Abstract 


Feature extraction forms a very important part of image processing. In this 
thesis we have considered various aspects of feature extraction, viz, compression, 
noise removal, edge charecterisation and local rescaling. Emphasis here has been on 
image compression and noise removal. We have used wavelet packets as the tool for 
achieving the same. 

For the purpose of image compression, we have used ‘best-basis’ algorithm using 
entropy as the cost function. Selection of the ‘best-basis’ is followed by a thresholding 
based on an energy criterion. A variety of mother wavelet functions have been 
used and it has been observed that the compression ratio achieved using the above 
algorithm depends on the wavelet function. For noise removal, we make use of 
‘visual analysis’ technique. The noise on the channel is modeled and anlaysed using 
wavelet packets. The noise energy distribution is calculated in the various subspaces 
and at various levels. The ‘subpace-energy’ plot so obtained gives a clear picture of 
noise distribution in various subspaces. The ‘noisy signal’ received on this channel is 
analysed in the similar fashion and a similar plot is obtained. The subspaces, which 
correspond to high noise energy and low signal energy can be discriminated visually 
. To the rest of the subspaces we apply a threshold and drop all the coefficients 
below that threshold. The above threshold is calculated using ‘hypothesis testing’ 
technique. The noise removal upto 75 percent has been achieved. 

The wavelet packet decomposition of an image has also been used for edge en- 
hancement and edge detection by rescaling ‘average signal’ or ‘detail signal’ depend- 
ing on the requirement. 
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Chapter 1 


Introduction 


1.1 Importance of Feature Extraction 

In analyzing and interpreting signals such as musical recordings, seismic signals, 
or stock market fluctuations , or images such as mammograms or satellite im- 
ages, extracting relevant features from them is of vital importance. Often, the impor- 
tant features of signal analysis, such as edges, spikes or transients, are characterized 
by local information either in the time (or space)domain or in the frequency (or 
wave number)or both: for example, to discriminate seismic signals caused by natu- 
ral earth quakes, the frequency charecterstics of primary waves, which arrive in a short 
and specific time window, may be a key fac.tor;to distinguish benign and malignant 
tissues in mammograms, the sharpness of the edges of masses may be of critical 
importance. 

In this thesis we explore how to extract relevant features from signals and discard 
irrelevant information for a variety of problems in signal analysis. The term feature 
extraction covers following aspects of signal analysis: 

Compression : how to represent and describe signals in compact manner for 
information transmission and storage. 
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Noise removal : how to remove random noise or undesired components from 
signals(also called discard). 

Classification : how to classify signals into categories. 

Regression : how to predict a response of interest from input signals. 

Edge Characerization : how to detect singularities(e.g., spikes or edges)in 
signals and characterize them. 

In the present thesis, the work has been restricted to compression, noise removal, 
and edge detection. 

Various methods have been used for the purposes of feature extraction. But it 
seems quite logical that, as far as images are concerned, the method that closely 
mimics the human visual system (HVS), should give comparatively better results. 
Use of wavelets fall in this category. In this thesis, we have gone a step further and 
have used wavelet packets for the same purpose. 

The methods we develop here can be applied to many different types of signals 
and images. The signals and images treated in this thesis are all discrete: they are 
simply matrices consisting of finite number of real valued samples. These signals 
normally have very large number of samples. Therefore, extracting only important 
features for the problem at hand and discarding irrelevant information(this is called 
reduction of dimensionality) are crucial. 

1.2 Overview of The Thesis 

This thesis explores some of the feature extraction and signal analysis problems 
using the wavelet packets and best-basis paradigm. The emphasis here has been on 
compression and noise removal. The thesis is organised as follows. 

Chapter-1 gives a brief introduction of the entire work. It introduces the term 
“Local Feature Extraction” and also covers the various aspects covered by this term. 
It also covers, in brief, the importance of feature extraction. 
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Chapter-2 reveiws the concept of multiresolution analysis. It illustrates the Mai- 
lat s decomposition and reconstruction algorithm. It also discusses reconstruction 
error involved in decomposition and reconstruction using the above algorithm. 

It also discusses the formation of a library of orthonormal bases using wavelet 
and wavelet packet bases. It introduces terms like, library, dictionary and discusses 
Information cost function and Best-basis selection using information cost function. 
Basiscally cost function used for best basis selection is entropy. 

In this chapter we also discuss two dimension wavelet packets, notion of entropy 
while selecting best basis and application of best basis criterion to images. 

Chapter-3 discusses the use of wavelet packets and best basis m image compres- 
sion, noise removal, edge detection and local image rescaling. In this chapter we 
also discuss the results achieved. 

Chapter-4 concludes the thesis with some suggestions for the future work. 
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Chapter 2 


Review of Multiresolution 
Analysis 

2.1 The Basic Idea 

Let Ayf(x) be the operator which approximates a signal at a resolution 2LWe 
suppose that our original signal / (x) is measurable and has finite energy :f(x ) € 
L 2 (R). Here we characterize Ay from the intutive properties that one would expect 
from such an approximation operator. 

1. Ay is a linear opertor If Ayf(x ) is the approximation of some function f(x) 
at the resolution 2 J ,then A v f(x ) is not modified if we approximate it again at the 
resolution 2 J . This principle shows that Ay o Ay = Ay . The operator Ay , is thus 
a projection operator on a particular vector space Vy C L 2 (R ). The vector space Vy 
can be interpreted as the set of all possible approximations at the resolution 2 J of 
functions in L 2 (R). 

2. Among all the approximated functions at the resolution 2 J , Ay f(x) is the 
function which is most similar to f(x). 

ty(*) G Vy, \\g{x) — /(a , )|| > \\A v f(x)-f(x)\\. (2.1) 

Hence, the operator Ay is an orthogonal projection on the vector space Vy. 
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3. Contamm.eni The approximation of a signal at a resolution 2- 7+1 contains 
all the necessary information to compute the same signal at a smaller resolution. 
This is a causality property. Since A 21 is a projection operator on V 2J this principle 
is equivalent to 

Vi ez,v v c v v+ i (2.2) 

4- An approximation operation is similar at all resolutions The spaces ap- 
proximated should be thus derived from one another by scaling each approximated 
function by the ratio of their resolution values 

V; G Z, f(x) G V 2J & /(2s) G V v+ i . (2.3) 

5. The approximation A 2] of a signal f(x ) can be charecterized by 2° samples 
per length unit When f(x) is translated by a length proportional to 2~\ A 2 jf(x ) is 
translated by the same amount and ischarecterized by the same samples which have 
been translated. As a consequence of (3) , it is sufficient to express this principle 
for the resolution j = 0. The mathematical translations consist of the following 

• Discrete charecterization. There exists an isomorphism / from Vj onto J 2 (Z). 

• Translation of the approximation 

Wk G Z, Aifk(x) = A 1 f(x - fc), where f k (x) = f(x - k) (2.4) 

• Translaton of samples: 

= (».).6Z » I(A,(h(x)) = (a,- t ),£Z (2.5) 

6. When computing an approximation of f(x) at resolution 2 J some information 
about f(x) s lost. However as the resolution increases to +00 the approximated 
signal should converge to the original signal. Conversely as the resolution deceases 
to zero, the approximated signal contains less and less information and converges to 
zero. 
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Since the approximated signal at a resolution 2 J is equal to the orthogonal pro- 
jection on a space V v , this principle can be written as 

+oo 

lim V v = M V 2J is dense in L 2 (R) (2-6) 

j — > 4 -oo 

j=—oo 

and 

4 -oo 

hm^ V v = f 1 V v — {0} (2.7) 

We call any set of vector spaces which satisfies the above properties a multiresolution 
approximation of L 2 (R). 

It has been seen that the approximation operator A v is an orthogonal projection 
on the vector space V v . In order to numerically charecterize this operator, we must 
find an orthonormal basis of V 2 ,. Theorem (2.1) shows that such an orthonormal 
basis can be defined by dialating and translating a unique function 6{x). 

Theorem 2.1 Let (V 2 :) : ^z be a multiresolution approximation of L 2 (R) . There 
exists a unique function 4>(x) € L 2 (R), called scaling function, such that if w r e set 
<j>v(x) = 2-V(2 J :c) for j G Z (the dilation of f(x) by 2 J ), then 

(V¥J<t> 2] (x - 2 ~ J n)) ne2 ^ is an orthonormal basis of V v . (2-8) 

The proof of this theorem can be found in [l].The theorem shows that an orthonor- 
mal basis of any V 2J can be built by dialating a function <f>(x) with a coefficient 2 3 
and translating a the resulting function on a grid whose interval is proportional to 
2~ J . The functions <f> v { x ) are normalised with respect to the Lf(R) norm. The 
coefficient \/2~ J appears in the basis set in order to normalize the functions in order 
to normalize the functions in the L 2 (R) norm. For a given multiresolution approxi- 
mation (Vy )jez-, there exists a unique scaling function <j>{x) which satisfies the above 
equation. However, for different multiresolution approximation scaling functions are 
different. 

The orthogonal projection on V v can now be computed by decomposing the 
signal f(x) on the orthonormal basis given by Theorem (2.1). That is, 

4-00 

Vf(x)eL 2 (R),A 2] f(x) = 2-’ £ ( /(u),<M«-2- J n))<M*-2- J n )• (2.9) 

n= — 00 
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The algorithm for determining discrete approximation fo the signal f(x ) at res- 
olution 2° is disussed im Mallat’s decomposition algorithm in the next section. 
Since <t>(x) is a low-pass filter, this discrete signal can be interpreted as a low- 
pass filtering of f(x) followed by a uniform sampling at the rate 2 J . The scaling 
function <j>{x) forms a very particular low-pass flter since the family of functions 
(v/2 1 ^ (x — 2 ~ J n ». ^ z is an orthonormal family. 

Before we consider the wavelet representation, one more theorem which will be 
used later on is mentioned below(the proof of the same is given in [1]): 

Theorem 2.2 Let 4>{x) be a scaling function, and let H be a discrete filter with 
impulse response h(n) = (<f> 2 -i(u),<f>(u — n)) Let H(u> ) be the Fourier Series defined 

by 

+oo 

H{to)= J2 h(n)e~ inuj . (2.10) 

n=— <x» 

H(u) satisfies the following two properties: 

| H( 0) |= 1 and h(n) = 0(n~ 2 ) at infinity (2-11) 

| H{ia) | +| H(u + tt) | 2 = 1 (2.12) 

Conversely let H(u>) be a Fourier series satisfying above equation and such that 

| H(u) & 0 for u € [0, tt/2] (2.13) 

The function defined by 

+oo 

(j>(u) = n H( 2~ p u) (2.14) 

p=i 

is the Fourier transform of scaling function. In he next section we will see the wavelet 
representation of MRA. 

Our aim is to build a multiresolution representation based on the differences 
of information available at two succesive resolutions 2 J+1 and 2- 7 . This difference 
of information is called the detail signal at the resolution 2 3 . The approximation 
at the resolution 2 3+1 and 2 J of the signal are respectively equal to its orthogonal 
projection on V 23 +i and V 2 j- By applying the projection theorem , we can easily 
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show that the detail signal at the resolution V is given by the orthogonal projection 
of the original signal on the orthogonal complement of V 2J in V 23 +i. Let 0 23 be this 
complement, i.e. 0 23 is orthogonal to V 23 , 

0 23 © V 23 = V 2 J +1 . 

To compute the orthogonal projection of a function f(x ) on 0 23 , we need to find 
an orthonormal basis of 0 23 . Theorem (2.3) shows that such a basis can be built by 
scaling and translating a function 0(z). 

Theorem 2.3 Let {V 2J ) :G z t> e a multiresolution vector space sequence, <f>{x) the 
scaling function, and H the corresponding conjogate filter. Let ip(x) be a function 
whose Fourier transform is given by 

V’M = G with G( a;) = (2.15) 

Let rj' 23 (rr) = 2 j V , (2- ? t) denote the dilation of r/>(x) by 2 J then {y/2~ 3 ^ 23 {x — 2~ J n))^ 
is an orthonormal basis of 0 23 and {\/2~^^) 23 {x — 2~ J n)^ ^ is an orthonormal 

basis of L 2 (R). 

ifi(x) is called an orthogonal wavelet. 


2.2 Introduction to Library of Orthonormal Bases 

Having reveiwed the concepts of MRA now we will look into the Library of orthonor- 
mal bases thus formed. We will also reveiw “best-basis” algorithm of Coifman and 
Wickerhauser which was developed mainly for signal compression and compare w'ith 
the well known Karhuen-Loeve basis . 

Throughout this thesis, we consider real-valued discrete signals with finite 
length n(= 2 no ). To focus our attention on our main theme, we assume the peri- 
odic boundry condition on the signals;a signal x = (xk)1Zo is extended periodically 
beyond the interval [0,n-l] with x k = x k (mod n) for any k € Z if necessary. 

If one is concerned with the discontinuies created by the periodic boundry con- 
dition, their effects can be reduced by considering an evenly-folded version x' = 
(x 0 , . . . , x n — 1 , x n _i, . . . , x Q )oi period 2n. 
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2.3 Wavelet Bases 


The wavelet transform can be considered as a smooth partition of frequency axis. 
The signal is first decomposed into low and high frquency bands by the convolution 
-subsampling operations with the pair consisting of ’’lowpass” filter and a 

“highpass” filter {gk)kZ o directly on the discrete time domain. Let / = {fk}k=o be 
a real valued vector of even length K . Let H and G be the convolution-subsampling 
operators using these filters which are defined as : 

( Hf)k — 53 hif 2 k+i{Gf)k = 53 9ihk+i ( 2 . 16 ) 

1=0 1=0 

for k = 0,1,... A — 1. Because of the periodic boundry condition on f(whose 
period is K), the filtered sequences Hf and G f are also periodic with period K/2. 
Their adjoint opera.tions(i.e., upsampling-anticonvolution)//* and G* are defined as 

£ h-21 fi (G’f)i= £ Sk-*fi (2.17) 

CKA.-2 KL 0<k-2l<L 

for A: = 0,1,... 2K — 1. The operators H and G are called (perfect reconstruction) 
quadrature mirror filter (QMFs)if they satisfy the following orthogonality (or perfect 
reconstruction)conditions: 

HG* = GH* = 0, and H * H + G * G = /; 
where I is is identity operator. These conditions impose some restrictions on the 
filter coefficients {A*;} and {^.}.Let m 0 and m : be the bounded periodic function 
defined by 

L- 1 L-1 

mo(() = £ he ,l( , TOi(f) = £ (2-18) 

k = 0 k = 0 

Daubechies proved that H and G are QMFs if and only if the following matrix 
is unitary for all £ € 3t-' 

f m 0 (0 mo^ + Tr)'' 

K mU) m i (^ + x ) j 
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Various design criteria (concerning regularity, symmetry etc.) on the lowpass 
filter coefficients hk can be found. Once hk is fixed, we can have QMFs by setting 
9k = ( — 

This decomposition (or expansion or analysis) process is iterated on the low bands 
and each time the high frequency coefficients are retained intact. At the last iteration 
, both, low and high frequency coefficients are kept. In other words ,let x= {x k }lZl € 
be a vector to be expanded. Then , the convolution -subsampling operation 
transform the vector x into two subspaces Hx and Gx of lengths n/2. Next, the 
same operations are applied to the vector of lower frequency band to Hx to obtain 
H 2 x and GHx of lengths n/4.If the process is iterated J(< no) times , we have 
the discrete wavelet coefficients(Gx, GHx, GH 2 x, . . . GH J x, H J+l x)ai length n.As 
a result , the wavelet transform analyzes the data by partitioning its frequency 
content dyadically finer and finer toward the low frequency region(i.e , coarser and 
coarser in the original time domain). 

If we were to partition the frequency axis sharply using the charecter using the 
characterstic functions(or box-carfunctions),then we would have ended up the so 
called Shanon(or Littlewood-Paley) wavelets, i.e., the difference of two sine func- 
tions. Clearly, however, we cannot have a finite length filter in the time time domain 
in this case. The other extreme is Haar basis which partitions the frequency axis 
quite badly but gives theshortest filter length. (L = 2 with h 0 = hi = l/-\/2)in the 
time domain and which are suitable for describing discontinuous functions. 

The reconstruction (or synthesis)process is very simple : starting from the low- 
est two frequency bands H J+1 x and GH J x , the adjoint operators are applied and 
added to obtain H J x = H*H j+1 x + G*GH J x. This process is iterated to recon- 
struct original vector x. The computational complexity of the decomposition and 
reconstruction process is in both cases as easily seen. 

Because of the perfect reconstruction condition on H and G, each decomposition 
step is also considered as a decomposition of vector space into mutually orthogonal 
subspaces. For the purpose of explanation we will now follow a different notation 
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ft 


0,0 


r— fi 2,l 



r— °3,1 


ft 


2 , 0 - 


ft 


3,0 


Figure 2.1: A decomposition of fi 0 ,o into the mutually orthogonal subspaces using 
wavelet transform (with J=3).The symbols in bold represent the subspaces kept 
intact by the wavelet transform 


to identify a subspace. Let O 0 ,o denote the standard vector space R n . Let fl ]i0 and 
fii,i be mutually orthogonal subspaces generated by application of of the projection 
operators H and G respectively to parent space fio,o-Then, in general, jth step of 
decomposition process can be written as: 


® for j — 0, 1 , . . . J . 

It is clear that dimfL,, = 2 n °~- 7 . The wavelet transform is simply a way to represent 
f! 0 ,o by a direct sum of mutually orthogonal subspaces. 

( J \ 


fto.o — (D I ® 


(2.19) 


U - 1 


and the decomposition process is illustrated by Fig (2.1) 
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2.4 Wavelet Packet Bases 


For oscillating signals such as acoustics signals, however , the analysis by wavelet 
transform is not always efficient because it only partitions the frequency axis finely 
towards the low frequency. The wavelet packet transform is a generalised version 
of the wavelet transform: it decomposes even the high frequency bands which are 
kept intact in wavelet transform. They are much more oscillatory in nature as 
compared to the wavelet basis vectors. The first level of decomposition gener- 
ates Hx and Gx just like in wavelet transform. The second level generates four 
subsequences, H 2 x, GHx, HGx, G 2 x. If we repeat this process for J times, we end 
up having J nexpansion coefficients. It is easily seen that the computational cost of 
this whole process is about 0(Jn ) < 0(n log 2 n) This iterative process naturally 
generates subspaccs with different frequency localizaion characteristics. The root 
node of this tree is again Q 0 ,o- The node 0^* splits into the two orthogonal sub- 
spaces and by the operators H and G , respectively: 

= ^j+],2 k © flj-f-l ,2k+lf 'OVJ =0,1,... «/, k = 0, . . . 1. 

The Fig (2.2) shows the binary tree of subspaces of Q 0 ,o : 

Clearly, we have redundant set of subspaces in the binary tree. In fact ,it is easily 
proved that there are more than 2 2J 1 possible orthonormal bases in the binary tree. 
This binary tree (or as we see later the quad tree in case of images) is one of the 
most important tools. 

2.5 Selection of a “Best Basis” from a Library 
of Orthonormal Bases 

2.5.1 Information Cost Function 

An efficient coordinate system for representing signal should give large magnitudes 
along a few axes and negligible magnitudes along most axes when the signal is ex- 
panded into the associated basis. We then need a measure to evaluate and compare 
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Figure 2.2: A decomposition of fio,o into the tree-structured subspaces using wavelet 
packet transform (with J= 3). 


the efficiency of many bases. Let J denote this measure which is often called “in- 
formation cost” function. There are several choices for X. All of them essentially 
measure the “energy concentraton ” of coordinate vector. A natural choice for this 
measure is the Shanon entropy of the coordinate vector. Let us define the entropy 
of a nonnegative sequence p = {p t }with = 1 by 


H(p)A-J^ t p,\og 2 p t ... 


( 2 . 20 ) 


with the convention 0 • logO = 0 ( From now on, we use “log” for the logarithm of 
base 2.) For a signal x,we set p t = (| x, | /||x|| r ) r where || • || r is the t norm and 
1 < r < oo and define 
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Often r = 1 or r = 2 is used. In this thesis we will use r = 2. 


2.5.2 Algorithm for Best-Basis Selection 


The “best-basis” algorithm of Coifman and Wickerhauser(this will be discussed later 
in details) was developed mainly for signal compression. This method first expands 
a given signal into a specified library of orthonormal bases. Then a complete basis 
called a best 6asis(BB) which minimizes a certain information cost function such as 
entropy is searched int this binary tree using devide-and- conquer algorithm. 


2 n 0~J-l) 


( 2 . 22 ) 


Algorithm 2.1 (The Best-Basis Algorithm) Given a vector x 

Step 1 Choose a library of orthonormal bases V (i.e. specify QMFs for a wavelet 
packet dictionary) and specify the maximum depth of decomposition J and an infor- 
mation cost X 

Step 2 Expand x into the dictionary T> and obtain coefficients {B Jyk x.} 0<:< j y o<fc< 2 J-i • 
Step 3 Set A Jyk = B J<k for k = 0, . . . 2 J - 1. 

Step 4 Determine the best subspace A hk forj = J — 1, ... 0, k = 0, . . . V — 1 by 


A],k — 


B Jyk if X(B :iyk x) < X(A 3 +\ y 2kX- U Aj+i^-i-ixz) 

A J+ i, 2 k © i4 J+1|3 * +1 otherwise. 

To make this algorithm fast, the cost function 1 needs to be additive. 


(2.23) 


2.6 Mallat’s Decomposition Algorithm for 1-D 
Signal 

We have seen that as k € Z is the orthonormal basis for V 2 :, any function 
belonging to L 2 {R) can now be decomposed along all translates of <j>(2 :, x). The result 
is denoted by A ^ . Similarly, the function can be decomposed along all translates of 
x) and the result is denoted by D^j- If the actual signal is having N number of 
samples these results are having iV/2- 7 number of samples. Thus the resulting signals 
belong to T 2 ( ./V/^ ) . This is illustrated for one dimensional case and then extended 
to two dimensional case. 

The decomposed signal along all the translates of <f>(2 J x) is called the averaging 
signal and the decomposed signal along all translates of i/>( 2 3 x) is called the detail 
signal. This is due to the nature of finite sequences h(n) and g(n) used to generate 
the <j>{x) and ij'(x) respectively. 

2.6.1 Average Signal 

It is the inner product between the given function and translates of <f>{ 2 J x). It is 
denoted by 

(f(x),Vv</>( 2’x-k)), MkeZ 

The discrete version is given by 

{/(x t ), y/2^ <f>(2 :> x t — k)), Vk € Z and x t is position of i th sample. 
Equivalently, the inner product is written as 

= £*, f(x t )V¥(j)(^x t - k) 

= 2j<f>{-2’xS){2- J k), keZ (2.24) 

By changing the variables and denoting h(j) = — j)) and h*(j) = 

h(—j ) the Eqn (2.24) is written as 

= Zh*(2xi - k ) (/(*,) * Vv<f>{-2’ + 'x t )) k € Z 



where x, is the location of i th sample. 

This implies that it is the average signal at the j ttl resolution level obtained 
through convolution and decimation operation between the previous average signal 
(A*, + 1)/ and the h* sequence . 

2.6.2 Detail Signal 

The detail signal at the j th resolution step is the inner product between the given 
signal and the translates fo x/>(2 J x). It is denoted by 

(f(x),y/2jiK23x-k)),Vk€Z 

The discrete version is given by 

(f(x t ),V^^x t - *)),V k e Z and is the location of the 2 th sample 

of signal. 

Equivalently, 

= (/(: r.) * (2~ J k). k € Z. (2.26) 

By changing the variables and writing g(j) = (^> 2 -> (ar*), <t K x i~j)) and g*(j) = g(— j), 
Eqn. (2.26) can be written as, 

= Zs* (2i. - k) (/!':■>',) * (2~ 1 - I k), k E N. 

= Did) (2.27) 

This implies that the detail signal at j th step is obtained through convolution and 
decimation operation between the previous signal (A^ + 1)/ and the g* sequence as 
given in Fig (2.3). 

2.7 Mallat’s Reconstruction Algorithm for 1-D 
Signal 

As we have seen earlier that 

V v+ i — Ey © O2J • 
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Figure 2.3: Generation of detail signal at j th step. 
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Figure 2.4: Reconstruction of 1-D Signal 

So, V^+i can be constructed by the the orthogonal sum of approximation signal and 
detail sgnal in lower resolution stage. The reconstructed signal at (j + 1) resolution 
step is the inner product between the original signal and the translates of d>( 2- 7+1 :r). 
Mathematically reconstucted signal is given by, 

</(u), V¥^cj>{2^u - k)) (2.28) 

By changing variables and defining the h and g sequences Eqn (2.28) can be written 
as 

= Efc h(n - 2 A:) (/(it), - k)) + £* g(n - 2k){f(u), y/V^u - fr)) 

Efc + L k g(n - 2k)D d , (2.29) 

This implies that A d v+i {f) can be reconstructed by putting zeros between each 
samples of A d v (/) and D d v {f) and convolving the resulting signals with filters h and 
g respectively. The original signal can be reconstructed by repeating this procedure 
down to j = 0. This can be shown as in Fig (2.4) 


2.8 Application of Wavelet Packets to Images 

We have seen that a. bank of perfect reconstruction QMFs can be used for the 
purposes of analysis of a one-dimension signal to produce a dictionary of best basis, 


17 








Figure 2.5: Block Diagram for Convolution-Deconvolution 


as also to reconstruct the original signal from the best basis. The same concept can 
be extended to 2-D Images. 


2.9 2-Dimensional Operators 

Fig (2.5) is the traditional diagram describing the action of a pair of quadrature 
mirror filter.On the left is convolution and downsampling(by 2);on the right is up- 
sampling (by 2) and adjoint convolution, followed by summing of components. The 
broken lines in the middle represent either transmission or storage. 

Now we can define 4 2-dimensional convolution-decimation operators in terms of H 
and (7 namely the tensor products of the pair of quadrature mirror filters: 

F 0 AH <g) H F 0 v{x,y) = E,,A^ j) h ^+Ay+i 
FyAH®G F l v{x,y) = ^A i G) h ^+^2y+ 3 (2.30) 

F 2 AG®H F 2 v{x,y) = L/(m')? 2 i+.K+j 
F 3 AG <g> G FA*, V ) = A *> Dfrx+ifrv+j 

These convolution-decimation operators have the following adjoints: 

FQv(x,y) = Et tJ v(i,j)h 2t+x h 2:i +y 

FA* A = (2.31) 

F%v{x,y) = j)y 2t+x ^ +y 

j F£v(x,y) = Et,j v (i’j)y2i+xg2j+y 
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The orthogonality relations for the above are 


FJf = 6 nm 1 

i = f 0 *f 0 © f;f © f 2 *f 2 © f*f 3 


(2.32) 


By a picture we will mean any function S = S(x,y) € l 2 (Z 2 ) . In practice , 
of course, pictures will be nonzero only at finitely many points. The space l 2 (Z 2 ) of 
pictures may be decomposed into a partially ordered set W of subspaces W{n,m) 
, where m > 0, and 0 < ?i < 4” 1 . These are images of orthogonal projections 
compost'd of products of convolution-decimations. By putting IT(0, 0) = Z 2 , and 
defining recursively 


U'(4?! + ?, rn + 1) = F. 2 t+J F 2l+J W(n,m) for i = 0, 1,2,3 (2.33) 

Those subspares may be partially ordered by a relation which we define recur- 
sively as well. Wo say IT is a precursor of W (write IT -< IT') if they are equal or 
if W = F* F\Y for a convolution-decimation F in the set {Fo, Fj, F 2 , F 3 }. We also 
say that IT -< W if there is a finitesequence Vj , . . . , V n of subspaces in W such 
that W -< Vj V n -< W . This is well defined, since each application of F*F 

increases the index m. 

Subspaces of a single precursor IT £W will be called its descendents , while the 
first generation of descendents will naturally be called children . By orthogonality 
condition, 

it = FqFqW © f*f x w © f;f 2 w © f;f 3 w (2.34) 

The right hand side contains all children of IT. 

The coordinates with respect to the standard basis of lT(n, m) are in F(i) . . . F( TO )1T(0, 0), 
where the particular filters F( i) . . . F( m ) are determined uniquely by n. Therefore we 
can express in standard coordinates the orthogonal projections of 1T(0, 0)onto the 
complete tree of subspaces W by recursively convolving and decimating with filter. 

We may think of the quadrature mirror filters H and G as low-pass and high-pass 
filters, respectively. They can be described as nominally deviding the support set 


19 



w 


F 0 W 

F\W 



f 2 w 

f 3 w 


Figure 2.6: Decomposition of an Image into its First generation descendents 
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Figure 2.7: Decomposition of an Image into Two generation descendents 

of the Fourier transform S of the picture into dyadic squares. If the filters were 
perfectly sharp, then this would be literally true, and the children of W would cor- 
respond to 4 dyadic subsquare onne scale smaller as in Fig (2.6) 

Fig (2.7) shows two generations of descendents ,the complete decomposition of 
x . The subbands are labeled by the “n" index in W(n, m). Within the dyadic 
squares are the n indices of the corresponding subspaces at that level. If we had 
started with a picture of N x N pixels, then we could repeat this decomposition 
process log 2 (iV) times. 
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Figure 2.8: Decomposition of an Image into L levels 


2.10 Two-dimensional Wavelet Packets 

Wo have seen in the earlier chapter , a method for generating a library of or- 
thonormal bases by recursive QMF convolution-decimation. From an algorithmic 
point of view , t his is equivalent to recursive subband coding while retaining all inter- 
mediate subband decompositions. The coefficients produced at each stage are corre- 
lations of the signal with compactly-supported oscillatory functions called “wavelet 
packets 5 '. If perfect reconstruction QMFs are used , then the wavelet packets satisfy 
some remarkable orthogonal properties. 

From the tree W of subspaces we may choose a basis subset , defined as a collec- 
tion of mutually orthogonal subspares W €W or lists of pairs (n,m), which together 
span the root. Basis subsets are in one to one correspondence with dyadic decompo- 
sition of the unit square. Classical subband coding takes coefficients from a fixed set 
of subbands, usually from a single levelof trees as in figueres above. Wavelet trans- 
form coding also extracts coefficients from a fixed collection of blocks, the octave 
subbands, which are schematically represented by Fig (2.9) 

Fig (2.10) shows a typical dyadic decomposition down to level 4;its basis subset 

CENTRAL LfBRARV 

01 h I. T, KANPUR 


dm, Mm. 


Figure 2.9: Schematic representation of octave-subbands 


consists of the Hi pairs (0,1 ),(3,1 ) , (4,2), (5, 2), (6, 2), (7, 2), (8, 2), (9, 2), (10. 2) 
,(45, 3), (-Hi, 3), (47,3), (176, 4), (177, 4), (178, 4), and (179,4) 


2.11 Mallat’s Decomposition Algorithm for 2-D 
Signal 

Average Signal 

In this a separable scaling function is considered. The inner product between digi- 
tised input signal /(x,-,y t ) and 2 j x t — k)(j>(2 : yi) is given by 

= Ex, £». ffa,y*W V<t>{Vx t - k)<t>{Vy x - 1) 

= ((/(**, yi) * V¥^(-‘2 J x t ))(2~ j k) * <j>(-2 j y t j) (2 ~H), k,l € Z (2.35) 
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Figure 2.10: Decomposition of a two dimensional signal down to level 4 


By changing variable’s and defining h* sequence as in 1-D case the Eqn (2.35) can 
be written as, 

= }Zk,i h *(2.r, k)h m [2y t - l) (/(j\, yi) * \/^>(2 J+1 x t ) * <j>(2 1+1 y z )) (2' J_1 /), k, l € N 

= A d v (f) ( 2 ‘ 36 ) 

This implies that the average signal at the j th resolution step is obtained through 
double convolution and decimation operation between the previous average signal 
and h*(x,) and h*(y t ) sequences respectively as shown in Fig (2.12). 


Detail Signal 

Here we consider separable scaling and mother wavelet functions. As shown in [1],[2] 
we note that following functions are ortogonal to each other: 

• ^(ar)^(y) L<f>{x)i'(y) 
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Figure 2.11: Detail Signal at j tk resolution for 2-D image. 
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Figure 2.12: Average Signal at j th step for 2-D image. 

• V , ( ;r )V’(j/)- 1 -^(y)V , (-r) 

• Tj'(x)iliy)±<f>(x)<t){y) 

As in 1-D case we can write 

V-2J+1 = Vv © V 2 iJ © 0 2 ij © Op] . 

Where, 

0 2 i j is Vector space generated by translated functions of x^^y). 

0 2 ij is Vector space generated by translated functions of i/>(2 J a;)^(2 : y). 

0 2 zj is Vector space generated by translated functions of i/>(2Ar)i/>(2 J i/). 

If represents the detail signal at resolution 2 J along the i ik direction, we have 
the result as shown in Fig (2.11). The approximation or average signal at the j h 
resolution is obtained through the convolution decimation structure given in Fig 
( 2 . 12 ). 
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Figure 2.13: Reconstruction of 2-D Signal 

2.12 Mallat’s Reconstruction Algorithm for 2-D 
Signal 

Wo have 

^ 2 ! + J l 2> 0 W 1 ./ © Oiij © 0 2 *j- 

Where, 

O-^ j is Wet or space generated by translated functions of <f>(2 3 x)ip(2 3 y) 

Opj is Wet or space generated by translated functions of ip(2 3 x)<f>(2 3 y). 

O&j is Vector space generated by translated functions of ip(2 3 x)rj}(2 3 y). 

If I)! p represents the detail signal at resolution 2 3 along the I th then by Fig (2.13) 
we can get the original signal. 

In next chapter, we will see as to how the wavelet packets can be applied for the 
purpose of feature extraction in case of 2-D signals. 
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Chapter 3 


Application of Wavelet Packets in 
Feature Extraction 


In t lit* previous chapter we have seen as to how we can apply wavelet packets to 
two dimensional signals. In this chapter we will use these two-dimensional wavelet 
packets for the various aspects of feature extraction as discussed in chapterl. We 
will consider t hose aspects in the subsequent sections. We have used two dimensional 
separable wavelets throughout the thesis work. The wavelets that have been used 
are:- 


• Haar 

• (loiflet 

• l)aubechies-4 

• I)auhechies-8 

• I)aubechies-20 

The scaling functions <f> and the wavelet function V ; ' n respect of these wavelets are 
given as Fig (3.1). 
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wavdet function scaling function 
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wavelet function scaling function 
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Figure 3.1: Scaling functions and Wavelets used in this thesis 
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3.1 Development of 2-D Wavelet Package for Anal- 
ysis of Images 

A package has been developed for the purposes of analysis and processing of images 

using wavelet packets. The features of this package are described as follows. 

(a) It analyses both 1-D and 2-D signal using a variety of wavelets to generate a 
library of orthonormal bases. In the case of 2-D signals separable filters are 
used. The package also reconstructs the signal from the above library. 

(b) It selects the ‘best-basis’ from the above library. The algorithm for selection 
of ‘best-basis’ is given in subsection (3 2.1). Once the best-basis has been 
selected, the package is then used to compress the same and code the additional 
information required to be transmitted along with the best-basis. 

(c) The package also computes the energy distribution of a signal and then displays 
this information as shown in Plates (6) and (7). 

(d) From the comparative displays of noise energy and received signal energy we 
can decide on the subspaces to be dropped or thrsholded as explained in sub- 
section (3.7.1). Based on the algorithm discused in subsections (3.7.1)) and 
(3.7.2). The package then reconstructs the signal from the ‘noisy signal’. 

(e) Provisions to do edge enhancement, edge detection and rescaling of image is 
also included with the package developed. 

3.2 Image Compression 

Traditional image compression techniques have been designed to exploit the stastical 
redundancy present within real world images. DCT, DPCM and the entropy coding 
of subband images are all examples of this stastical approach. Removing redundancy 
can only give a limited amount of compression. To achieve high ratios some non- 
redundant information must be removed. The statastical coders produce 
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an annoying visual degradation when operating in this mode because they produce 
errors in visually important^ parts of the image structure such as edges. 

By using wavelets basis’ for image compression some success was achieved to 
overcome the above problems. In this thesis we have discussed yet another approach, 
viz, use of wavelet packet bases for the purpose of image compression. As w T e will 
see subsequently, wavelet packets provide more flexibility and better compression. 
As we have discussed in the previous chapter, we use 4 best-basis’ algorithm for this 
purpose. The steps involved in this process are:- 

(a) Analyse (he 2-D signal using library of two dimensional wavelet packets, and 
form a quad-tree of depth L. 

(b) From the above library, select the best-basis using some information cost func- 
tion. 

(c) Compress the best basis by selecting a threshold. 

These processes are discussed in the subsequent sections. 

3.2.1 Selection of Best Basis in Library 

We have seen the algorithm for selection of best-basis in the last chapter. Now 
we will modify the same for a 2-D signal. Set a predetermined deepest level L. 
Label each subspace at level L as “kept”, i.e. ,the subspaces indexed by (n,L) for 
0 < n < A 1 . Next, we set the level index to L — 1. The information cost of the 
subspace W(n,m) s then compares with the sum of the informaton costs of its 
children W(An, m +1), W(An + 1, m + 1), W(An + 2, m-f 1), W(An + 3,m + 3). If the 
parent is less than or equal to the sum of the children, then the parent is marked 
as “kept”. This means that by choosing the parent rather than the children, we 
will have fewer coefficients above threshold in the representation of S. On the other 
hand, if the sum of children is less than the parent, we leave the parent unmarked 
but attribute to her the sum of children’s information costs. By passing this along, 
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prior generations will always have their information costs compared to the least 
costly desrendents. 

After all the suhspaces at level m = L — 1 have been compared to their children 
,the level index is decremented and we continue the comparison. We can proceed 
this way until we have have compared the root W(0,0) to its 4 children. We claim 
that the topmost “kept” nodes W with no kept precursors is a basis subset which 
minimizes the information cost. This can easily be proved by induction. 

3.2.2 Compressing a Best-Basis Representation 

In practice, it is necessary to compress the signal due to constraints like bandwidth. 
Hence, the best -basis information, determined by the algorithm discussed above has 
to be further compressed. Methods to achieve this compression are as follows. 

Existing Method 

This method has been suggested by Wickerhauser [4]. Suppose that B is a best-basis 
subset of H '. Wo may then extract just non-negligible coefficients and transmit them, 
together with then r locations in the tree. This number of coefficients is no greater 
than the number of pixels, since it is chosen after comparison with the original basis, 
among others. In that case we may use the entropy cost function to obtain the most 
concentrated representation, and then take only as many of the largest coeficients 
as wo ran afford. This may be accomplished by first sorting into decreasing order by 
absolute value, then reading off the desired number of coefficients. Alternatively, since 
we know in advance how many coefficients we can use, it may be more efficient to 
bubble up the top few coefficients and discard the rest of the array. The second 
method is better if the number of retained coefficients is less than log N , where N 
is the number of pixels. 
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Method used in the Thesis for Compressing the Best-Basis 

In the present thesis we have used energy criterion for compressing the best-basis. 
Once the best-basis has been selected, we calculate the normalized energy of each 
subspace. We then find the maximum normalized energy and select a threshold t as 
some fraction / of this maximum energy. Then, we discard all the subspaces having 
energy less than this e. The fraction, / has been selected based on the desired SNR 
and peak SNR of the image. 

3.2.3 Efficient Coding of Side Information which Describes 
Best-Basis 

The transformation from a 2-Dimensional signal to its best-basis representation is 
non-linear since the choice of basis depends upon the signal itself. Together with 
the coefficients we must include the extra information describing which basis was 
selected. 

There are atleast 2 ways to include the information about the basis that were 
selected. Best-basis coefficients may be individually tagged with their coordinates in 
the best-basis tree. Suppose this tree begins with an TV x TV signal and decomposes it 
down to level L, where L < log TV. Then there are LN 2 wavelet packets coefficients, 
and it takes log 2 N bits to encode. 

Alternatively, we may agree upon an ordering of coefficients, write out the coef- 
ficients in this order after quantization , and then entropy code the entire list. If 
we were to use a single basis like wavelets then we need never explicitly tag any 
coefficients. We obtain compression because the quantized stream of coefficients has 
a lower entropy than that of the original signal. 

We shall now discuss second method, whereby we will include some side infor- 
mation which describes the chosen basis, and we shall then write all the quantized 
coefficients from that basis out into a stream for entropy coding. This method is 
substantially more efficient , and is esential for a competitive picture compression 
algorithm. 
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igliri' a.2: Till' subspacos array stacked over one another 


Describing the basis quantizing all the coefficients Imagine L+ 1 arrays 
of A’ x A' numbers. The first, array represents the original signal, which we may 
( . a ll %. the second is a concatenation of 4 subspaces obtained via sepearable filter 
convohition-derimalion,i.e. , the spaces F 0 {X)F 0 (Y)Z, F x (X)F 0 (Y)Z , F 0 (X)F l (Y )Z, 
and F l {X)F l {Y)Z. Array r» represents the concatenation of the 4 m subspaces that 
make up level m of the wavelet packet decomposition. Ofcourse,we must satisfy the 
condition 0 < m <L< log N. These arrays can be visualised as stacked one atop 

other (as shown Fig (3.2)). 

Suppose that from this collodion of arrays we have chosen a best-basis. This 
will be a subset of the coefficients having the property that if one element of a 
suhspace is in the basis , then whole subspace is in the basts. Such a subset can 
be identified with a cover by dyadic subarrays. Looking down through the stack of 
arrays, this cover gives a tiling of the original N x N array by square subarxays of 
size 2-"JV x 2-‘N, where m is the level from which that particular subspace was 

chosen. 
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Levels map Here 2 arrays are used. One describes, what is called a “levels 
map" while the other describes a “coefficient list”. Levels map has 2 2L integers 
of length log 2 (l + L ) bits each, and describes the level from which a corresponding 
N2~ l x N2~ L block of coefficients in the next list was chosen. The coefficient list 
consists of IV x IV coefficients from the best-basis, scanned in some agreed upon 
pattern. 

If the original signal turns out to be the best-basis representation, then every 
coefficient will be chosen from levelO, the levels map will consists of 2 L x 2 h 0's 
, and the coefficients list will contain the original signal. On the other hand, if 
complete bottom level L turns out to give the best-basis, then levels map will contain 
2 L x 2 l IS s and the coefficient list will contain all the coefficients from the bottom 
level, level L. In the case of wavelet basis turning out to be the best, the levels map 
will contain a de.sniption of the wavelet basis and the coefficients list will contain 
coefficients of the signal. 

The estimate overhead cost can be calculated. There are A L additional integers 
each of length log 2 (l + L ) bits. 'This gives a total of 4 i log 2 (l + L)/N 2 bits per 
pixel, which can be manipulated by making L < log 2 N. If we start with an image 
128 x 128 and decompose it to level L = 7 then the number of overhead bits per 
pixel is 3. 

Subspaee lists A basis can also be described by listing the subspaces it con- 
tains. One method is to list subspaces by level. 3 arrays are used: 

(a) an array num[L] of integers giving the number of subspaces chosen at each 
level, 

(b) an array of arrays subspace[ ][ ], listing the subspace chosen from each level, 
and 

(c) an array containing the complete set of chosen coefficients. 

The array num[ ] in (a) contains L entries of varying length, that is, 1 bit for entry 
0, which describes whether level 0, 2m bits for each entry m which tells as to how 
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many of the 4 m subspaces on level m are in the best-basis; and so on up to 2 L bits 
for level L. Together the subspares must account for all the coefficients. Hence we 
have the relation: 

num[0] * iV 2 -f num[l] * jV 2 /4 + num{2) * N 2 / 4 2 + . . . + num[L } * N 2 /4 L = N 2 , (3.1) 

This implies that rmm{L] = 4 h (\ - nurnfO] - ... - num[L - 1]/4 L_1 ), so it is not 
necessary to transmit this value to describe the basis. The total number of bits in 
the array num[ ] is thus 

L - 1 

1 +2j>n= 1 + (Z,-1)Z, = Z, 2 -L + 1 (3.2) 

m=l 

The array subspacef ] of arrays m (b) contains L entries of length num[0],num[l] 1], 

. . .,num[L-l], respectively. Array subspace[7??][ ] contains integers of length 2m bits; 
an ay subspace[0] not'd not be allocated, since there is unique subspace at level 0 
Similarly array subspace[L] need not be allocated. Suppose the subspace numbering 
scheme assigns the indices 4A% 4k + 1, 4k + 2 , and 4k + 3 to the subspaces descended 
from 4k. The actual subspaces in level L be labelled by integers 0, ...,4 L , and the 
ones that are actually present are the survivors after indices 4 L ~ m k, . . . ,4 L ~ m (k + 
1) — 1 are deleted for each subspace k at level 0 < m < L. 

With this scheme the total number of bits in subspace[ ][ ] is 

L - 1 

y~) 2 m * = 2 (L — 1) * 4 L_1 

m=l 

Hence the overhead for this coding scheme requires at the most 

(L 2 — L + 1 + 2(L - \)4 l ~')/N 2 (3.3) 

bits per pixel. This can be controlled by limiting L. In case of a 128 x 128 image 
analysed down to level 7 this value is 3. 
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3.3 Reconstruction 


A picture represented as coefficients may be reconstructed by calculating the value at 
each point of the appropriate linear combination of wavelet packets. This is done as 
fllows. First we allocate enough memory for the deepest level of the tree of subspaces 
that contain retained coefficients, and insert the coefficients into their apropriate 
locations. Then we reconstruct the parent subspaces by applying the adjoints of 
the convolution and decimation operators, which produces part of the next deepest 
level. Into this we add the retained coefficients which belong in that level, at their 
respective locations, and reconstruct the parent at this level. We continue in this 
manner until the root has been constucted. Fig (2.13) shows reconstruction from 
one level. 


3.4 Complexity of the Algorithm 

Suppose S is an iV-element picture. Applying convolution-decimations to generate 
the tree of coefficient sequence requires 0(N log N) opreations. Calculating infor- 
mation costs has complexity 0(N log N). Labelling “kept” subspaces is equivalent 
to a breadth-first search through the tree, which has complexity 0(A r ). Locating 
topmost “kept” subspaces is equivalent to depth-first search, with complexity of 
O(N) , and filling an output register with coefficients from the best-basis takes 
an additional O(N) operations. We can then perform radix sort to determine the 
largest coefficients in the utput register, which has a complexity of 0(N log N). 

Reconstruction from the retained coefficients has the same complexity as gener- 
ating all the coefficients, since the requirement of reproducing the entire tree still 
remains. 
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3.5 Results of Image Compression 

3.5.1 Determination of Compression Factor 


The compression factor is determined in the following manner:- 

(1) First we calculate the extra number of bits needed to send the additional 
information by Eqn (3.3). 

(2) Then we find the max value of energy and accordingly decide on a threshold. 

(3) We then discard the subspaces having energy less than the above threshold. 

(4) Now, the total number of bits per pixel, if the image was not to be compressed 
as above depends on the number of quantization levels Q and is equal to 8 if 
the quantization levels are 256 (log 2 Q). Thus, the total number of bits to be 
transmitted in this case is equal to total number of data samples multiplied 
by number of bits per pixel. 

(5) The total number of bits needed for transmission, if the image were to be 
compressed, using best basis depends on the number of samples retained after 
steps 2 and 3 above, and the extra number of bits per sample required as per 
step 1 above. Numberof bits per sample , then, is equal to the number of bits 
per sample obtained through quantization, (as in step 4) and the extra bits 
per sample required, for transmitting additional information. This figure is 
then multiplied by the number of samples retained. 


(6) Hence the compression is defined as 


Compression Factor 


Total No of bits obtained m step 5 
Total No of bits obtained m step 4 


3.5.2 Results 

For the purposes of testing the above results, three images were used, viz, Baboon, 
Nutan and Lenna. 
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Compression 

Figure 3.3: Plot of Compression achieved Vs Mean Square Error for various wavelets. 

It is seen that, the compression factor varies with the type of image selected. 
The results thus achieved have been tabulated in Tables (3 1) ans (3.2). PSNR 
mentioned above is calculated as under:- 

PSNR=-20Iog 10 ( - — j 

where n is the number of bits per pixel. Plate (1) shows the original Baboon image. 
Plates (2), (3) and (4) show reconstructed image using Da.ub-20, Coiflet-6 and Haar 
filter respectively. 

The results achieved depend on the type of wavelets used. The comparative 
results of compression achieved by using different types of mother wavelet are shown 
in Fig. (3.3. The criteria for the selection of these wavelets has been discussed in 
section (3.10). The wavelet function and scaling function for these wavelets has 
been given in section (3.10). At this stage we will suffice by tabulating regularity 
and variance in respect of various types of wavelets used and observing the fact that 
regularity and variance do have a direct bearing on the image reconstruction and 
hence on compression achieved. The detailed discussion is postponed till the above 
quoted reference. 
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Filter 

Compression 

Mean Square Er- 
ror 

PSNR 

Haar 

0.06 

0.015 

30.5 dB 


0.20 

0.01074 

31.9 dB 


0.68 

0.004645 

35.93 dB 

Coiflet-6 

0.06 

0.01422 

30.68 dB 


0.12 

0.013676 

31.01 dB 

■■■ 

0.41 

0.008203 

33.10 dB 


0.67 

0.004305 

35.8 dB 

Daub-4 

0.06 

0.0133 

30.96 dB 


0.13 

0.01303 

31.06 dB 


0.40 

0.008016 

33.17 dB 

Daub-8 

0.06 

0.011142 

31.74 dB 


0.13 

0.0.010591 

31.96 dB 


0.40 

0.007979 

33.11 dB 


0.64 

0.004102 

36.03 dB 

Daub- 20 

0.06 

0.01098 

32.11 dB 


0.13 

0.009161 

32.75 dB 


0.40 

0.007210 

33.65 dB 


0.69 

0.002360 

36.9 dB 


Table 3.1: Results for the compression achieved. Image used is Baboon 
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Filter 

Compression 

Mean Square Er- 
ror 

PSNR 

Haar 

O.U 

0.00247 

36.6 dB 

(’oitlet-6 

0.00 

0.008649 

31.32 dB 


0.13 

0.006338 

32.66 dB 


0.35 

0.002360 

36.96 dB 

Daub 1 

(1.00 

0 006251 

32 72 dB 


0.13 

0.00425 

34.22 dB 


0.13 

0.001901 

37.89 dB 

Daub S 

0.0(i 

0.005558 

32.85 dB 


0.13 

0.005069 

34.99 dB 


0.43 

0.001790 

38.15 dB 

1 ^ 

Daub- 20 

0.05 

0.007655 

31.84 dB 


0.13 

0.003793 

34.89 dB 


0.31 

0.001527 

38.84 dB 


Table 3.2: Results for the compression achieved. Image used is Nutan 
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Table (3.3) compares regularity and spatial variance for various types of wavelets. 



Haar 

C-6 

D-4 

D-8 

D-20 

Regularity 

1.0 

1:22 

1.41 

1.7755 

2.64 

Variance of 

0.0832 

0.0986 

0.1416 

0.179 

0.2192 

,t 







Table 3.3: The comparative table of the various types of wavelets used. 


3.6 Edge Charecteyisation 

Sharp transitions in images are preserved and depicted extremely well in wavelet 
expansion. Hence edges can be lHeated very effectively as they correspond to sharp 
transitions. Edge characterisation' has two aspects, viz, edge enhancement and edge 
detection. In wavelet packets representation, as opposed to the wavelet representa- 
tion, the detail signal is also further analysed along with the average signal. This 
results in increased flexibility andrgreater manoeuvreability of the edge information. 
This feature of wavelet packets decomposition can be exploited for the purpose of 
edge charecterisation. 

Let us assume that a picture S has a resolution L. Let (n,m,&) be the index 
of an amplitude in the complete: wavelet packet expansion, where, m = 0, 1, . . . L. 
Suppose we wish to detect an edge of scale mo in this picture, where a white region 
darkens to black in a distance 2 7V ’ p ~ L . Such an edge will contribute large amplitudes, 
to scales, 1,2, .. . m 0 at high frequencies. Such an edge can be graphed by selecting 
only those amplitudes c(n, m, &). above a threshold, sufficiently large, with m < m 0 
and n greater than an appropriate monotone function of m. In other words, we 
simply analyse the image down to the desired level L. Then we apply the above 
threshold to all the subspaces corresponding to the detail signal at that level. And 
then the signal is reconstructed with the help of the modified subspaces. If only edge 
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information is required then the average signal ran be neglected during the process 
of reconstruction. Plate (5) shows edge of Lenna image detected in similar fashion. 

Edge enhancement can also be achieved in the similar fashion Here we analyse 
the signal as above. Now we suitably rescale all the subspaces corresponding cor- 
responding to the detail signal while leaving the average signal unchanged. We can 
then reconstruct the signal using modified subspaces. 

Along with the flexibility of selecting the resolution level, (which is also a feature 
of wavelet basis), wavelet packets also provide the flexibility of selection of subspaces 
at each level. Thus, depending on the charecterstics of the edges required as also 
on the number of edges desired, we can select the resolution level, subspace and the 
threshold for edge charecterisation 

3.7 Noise Removal 

Denoising is a major problem in image processing. In this thesis, we make use of a 
visual analysis tool for the purpose of denoising. The visual tool used here is the 
subspace energy display of the noise and the noisy signal. This method is discused 
in the next subsection. 

3.7.1 Noise Removal using Subspace Energy Display 

If the noise that is likely to effect the image can be modeled then it is possible 
to reduce the noise from the 2-D signal affected by the above noise. The ‘noise’ 
can analysed using wavelet packets and the energy distribution in various subspaces 
at various levels can be displayed. The ‘noisy’ image that is received can also 
be analysed in the similar fashion. From the visual information about the noise 
distribution available from the display above, the subspaces that correspond to more 
noise energy can be found out. The corresponding subspaces can be either dropped 
or we can apply a thrshold so as to neglect the coefficients that are likely to contribute 
towards noise. The algorithm for selection of threshold is discussed in the next 
subsection. The above energy displays also give a clear picture of the comparative 
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distribution of energy in various levels. We can then select only that level where 
subspaces W'ith higher proportion of noise energy correspond to that with lower 
proportion of signal energy and then carry out the above thresholding. The level so 
selected is then used for the purpose of reconstruction. As will be seen in the results, 
an SNR improvement of about 6dB is very easily possible. Since, the above display 
uses varying grey level intensity to depict the energy distribution, it facilitates in 
identification of the noisy subspaces and the resolution level to which they belong. 


3.7.2 Use of Hypothesis Testing Technique for Calculation 
of Threshold 


The procedure followed above relies on the visual methods for determining the 
threshold. Hypothesis testing technique can also be used for the purposes of de- 
termining the above threshold [12]. Consider an information signal , s(t), cor- 
rupted by Additive White Gaussia Noise (AWGN), w(t), with variance a 2 . Then 
our received signal is r(t) = s(t) + w(t) and the discrete equivalent of the same 
is r(k) = s(k ) + w(k). If r(k) has DWT r l n where l E L and L corresponds to 
maximum level to which the signal is decomposed, and n E N where N is theset of 
points available for each level, then, because DWT is linear operation we have 

r ‘n = * l n + W ‘n 

where s l n is DWT of s(k ) and w l n is DWT of w(k). Then by using Neyinan- Pearson 
criterion we determine a threshold based on the desired p robability of false alarm. 
The result of hypothesis testing is the detected parameter set r l n . where 


r 


i 

71 


r 'n if I r i l> <M0 

0 if K |< th(l) 


and th(l ) is the threshold at level 1. As has been shown in [12], low pass filter h(n) 
increases the mean of the noise at the output while, high pass filter g(n) yields a zero 
mean. However, the variance of noise through each filter remains same, provided, 


£( M <>)) 2 = £« n )) 2 = 1 

n n 
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The above holds true for orthogonal wavelets. In case of non-orthogonal decompo- 
sition the variance at each level can he calculated. Let PRi, PWi , and PSi denote 
received signal energy, noise energy and information signal energy respectively. We 
may now calculate these values as under:- 

P = !>' ) 2 ; pw, = i psi = EtO 2 (3.4) 

n n n 

We can now estimate the expected signal energy as 

E{PS,} = PR t - E{PWi} (3.5) 

This signal energy estimate is used to determine the hypothesis testing thresh- 
old. Different thresholds will detect diferent sets of parameters. We will find a thresh- 
old that yields a parameter set with an energy that is equivalent to that of the signal 
estimate. That is, let 


PR: = T.KY 


and vary the threshold th(l) until PRi = E{PSi}. Once this has been calculated 
we can obtain the signal estimate as 

m = IDWTfP] (3.6) 

The same algorithm can be applied to non orthogonal wavelets, as also to non 
gaussian noise. 

As we have seen earlier also, wavelet packet analysis gives us greater flexibil- 
ity even for the noise removal. We have more number of subspaces available for 
manipulation. 

3.7.3 Results of Noise Removal 

We have used both, 1-D as well as 2-D signals for the purpose of denoising. The 
results are tabulated in Table (3.4). In the case of 1-D signals (Sine wave and ECG) 
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Original Signal 



0 50 100 150 200 250 


Noisy Signal 



Figure 3.5: Effect of denoising on Sine Wave signal 
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SNR before de- SNR after denois- Improvement 














the same has been shown in Fig (3.4), and Fig (3.5). Plates (6) and (7) show the 
energy distribution for white noise and ‘noisy’ Baboon image respectively.i Plate (8) 
shows ‘noisy’ Baboon image with a SNR of 16.22dB. Plate (9) shows the effect of 
denoising. The SNR of this reconstructed image is 22.10dB. 


3.8 Local Image Rescaling 

For the purpose of highlighting a certain region in the image we resort to local image 
rescaling. For this purpose, we simply replace c(n,m,k) (notation used is same as 
that used for edge detection), with c(ri\m',k') for a restricted range of k's. 

3.9 Quality Criteria for Wavelets used in Image 
Processing 

In this section we will try to study certain criteria which would dictate the selection 
of wavelets for processing image signals. 

3.9.1 Wavelet Regularity 

The regularity of mother wa.velet is a very important aspect and is closely related to 
the regularity of the signal to be processed. Since most of the images are smooth (but 
for the occasional edges) it looks reasonable to use regular wavelets. Because scaling 
function 4>{t) and wavelet function are related, they share the same regularity 
properties. Therefore, in this discussion, we restrict ourselves to the study of <?5>(f), 
which depends only on low pass filter H(z). Zeros at the Nyquist, frequency play 
an important role in determining the regularity. One zero in H(z) at z = — 1 is 
necessary to obtain continuity. To achieve a regularity order of N , H(z) must have 
at least N + 1 zeros at Z — —1. However, if there are other zeros also present, then 
they contribute towards killing the effect of regularity caused by zeros at rr = — 1. 
This effect can be as much as 90%. Rioul [13] studied this effect and derived an 
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algorithm to estimate the optimal regularity esimate. The same is given below. 

(a) Remove all the zeros of II (z) at 2 = -1. These zeros will contribute to a 
maximum Holder regularity of A', the actual number of zeros at z = — 1. 

(b) The remaining factor F(z) = (1 + r _1 ) _A H(z) will decrease this regularity. 
The decrease will be by the amount 1 — a. 

(c) or, mentioned above is calculated by first iterating F(z) as under: 

F'{z) = F(z)F(z 2 )...F{z 2 '- 1 ) 

Associated with this F'(z ), we have time sequence /*. Then, a,, is calculated 
as under: 

= 1 A h)g 2 max,, | Pn+vk I 

A 

(d) Regularity r is then given by: 

r = A' — 1 + O', (3 7) 

In practice, 1 is iterated 20 times to get the regularity estimate. We have already 
seen regularity estimates of some of the wavelets in Table (3.3). We have observed 
that higher regularity of the wavelet results in better compression ratio. 

3.9.2 Number of Vanishing Moments 

Number of vanishing moments N , of the wavelet V’ forms an other important crite- 
rion. This is a measure of the oscillatory character of the wavelet. The number of 
vanishing moments is defined as 

J x n x/j(x)dx = 0 Vn = 0, 1,... ,7V — 1 (3-8) 

Since ij' is a wavelet, N is greater than or equal to 1 and the property of the 
wavelets, viz, 4 , ( x )dx — 0 is verified. In fact, this property of number of 
vanishing moments is related to regularity of the wavelet. That is, any r--regular 
multiresolution analysis of the wavelet will generate wavelet tj> with r + 1 vanishing 
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moments. Also, all derivatives of $(u>) upto the order N — 1 are zero at the point 
a; = 0. A wavelet with N vanishing moments enables the cancellation of all wavelet 
coefficients of a polynomial signal whose degree is less than N. Thus, if / is a 
polynomial signal of degree less than N, on support of i/'m,n then 


n {/) — 1 ) — 0 


(3.9) 


3.9.3 Spatial Charecterization of the Scaling Function 

To determine how 4>(x) evolves with respect to x , we introduce the criterion 772 1 
which allows us to charecterize the scaling function <p spatially. To define m*, we 
introduce the function p(x ) such that: 

P( x ) = -j - with p(x) > 0 and J p(x)d(x) = 1 (3.10) 

mk is then defined as 

rrik = J (x — m])* p(x)dx with 772,] = J xp(x)dx (3-11) 

Thus 772] is equal to the mathematical expectation of p{x) and m 2 corresponds to 
the spatial variance of the scaling function (j). m 2 allows us to determine the energy 
concentration of <j> and provides information on the spatial length or localization of 
<f>. This criterion also applys to the wavelet rp. As we have seen in Table (3.3), along 
with Tables (3.1) and (3.2) higher spatial variance results in better results. This 
is because of better energy cocentration associated with the wavelets with higher 
spatial variance. 
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Plate 8 :Noisy Baboon Image. SNR is 16.22 dB 


Plate 9 : Baboon Image after denoising. SNR is 22.20 dB 
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Chapter 4 


Conclusion and Suggestions for 
Future Work 

4.1 Conclusion 

Wavelet Packets have come to be recognised as an important tool in image process- 
ing. In this thesis the same have been successfully exploited for feature extraction. 
Preliminary results obtained are satisfactory, therefore following conclusions can be 
drawn : 

• An image can be analysed using wavelet packets to generate a ‘quad-tree’ 
of wavepacket coefficients. From this tree we can select a ‘best-basis’ which 
minimizes a certain cost function, e.g. entropy. This representation can then 
be used to obtain reasonable amount of compression by applying a certain 
threshold to neglect subspaces whic have insignificant energy. 

• The compression achieved in this process depends on the following factors: 

(a) Desired signal-to-noise ratio. 

(b) Type of mother wavelet used. That is, the regularity of the wavelet used 
for the purpose of analysis. It has been observed that higher regularity 
results in better compression. 
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• The ‘noise’ present, on the channel ran he modeled and analysed using wavelet 
packets. The signal received on the channel (‘noisy signal’) can also be anal- 
ysed in the similar way. The ‘quad-tree’ obtained above ran then be used 
for generating subspace-energy displays for both, noise as well as the received 
signal. These displays then form an important visual tool for the purpose of 
denoising. 

• Wavelet packets, provide a wider choice of subspaces. This feature can be 
effectively used for edge-chareterisation and rescaling a local region in image. 
By selecting appropriate ‘average signal’ or ‘detail signal’, at appropriate level 
and then rescaling/thresholding we can achieve these aims. The number of 
edges and their charecteristics depend on the level of image decomposition 
and the threshold. 


4.2 Suggestions for Future Work 

It is seen that in image compression, the type of mother wavelet selected has an 
important effect on the compression ratio achieved. We have used only a few of 
them. It is suggested that the same algorithm can be tried using some other t}-pes 
of orthogonal wavelets, as well as non-orthogonal wavelets. 

In this thesis, while using the concept of multiresolution analysis for the pur- 
poses of generation of library of bases, we have always used a resolution factor of 
2 (dyadic resolution) along with separable filters. Now the studies have been made 
using a resolution factor of yC*2 (quincunx resolution). This type of resolution uses 
non-orthogonal, non-separable filters and will give a finer resolution. Hence, the 
algorithm discussed in this thesis can be tried using the same. 

The package developed for noise removal uses varying gray level intensities to 
display ‘subspace-energy’ distribution. The same can be modified to display the 
energy distribution using different coloures as human eyes are more sensitive to 
colours. The same package can be made more interactive. 
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